72 research outputs found
Algebraic geometry in experimental design and related fields
The thesis is essentially concerned with two subjects, corresponding to the two grants under which the author was a research assistant over the last three years. The one presented first, which chronologically comes second, addresses the issues of identifiability for polynomial models via algebraic geometry and leads to a deeper understanding of the classical theory. For example, the very recent introduction of the idea of the fan of an experimental design gives a maximal class of models identifiable with a given design. The second area develops a theory of optimum orthogonal fractions for Fourier regression models based on integer lattice designs. These provide alternatives to product designs. For particular classes of Fourier models with a given number of interactions, the focus is on the study of orthogonal designs, with attention given to complexity issues as the dimension of the model increases. Thus multivariate identifiability is the field of concern of the thesis. A major link between these two parts is given by Part III, where the algebraic approach to identifiability is extended to Fourier models and lattice designs. The approach is algorithmic, and algorithms to deal with the various issues are to be found throughout the thesis.
Both the application of algebraic geometry and computer algebra in statistics and the analysis of orthogonal fractions for Fourier models are new and rapidly growing fields. See for example the work by Koval and Schwabe (1997) [42] on qualitative Fourier models, Shi and Fang (1995) [67] on U-designs for Fourier regression and Dette and Haller (1997) [25] on one-dimensional incomplete Fourier models. For algebraic geometry in experimental design see Fontana, Pistone and Rogantin (1997) [31] on two-level orthogonal fractions, Caboara and Robbiano (1997) [15] on the inversion problem and Robbiano and Rogantin (1997) [61] on distracted fractions. The only previous extensive application of algebraic geometry in statistics is the work of Diaconis and Sturmfels (1993) [27] on sampling from conditional distributions.
Two polynomial representations of experimental design
In the context of algebraic statistics an experimental design is described by
a set of polynomials called the design ideal. This, in turn, is generated by
finite sets of polynomials. Two types of generating sets are mostly used in the
literature: Groebner bases and indicator functions. We briefly describe them
both, how they are used in the analysis and planning of a design and how to
switch between them. Examples include fractions of full factorial designs and
designs for mixture experiments.
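As an illustration of the indicator-function representation, the sketch below computes the indicator function of a fraction of a two-level full factorial design. It assumes levels coded as -1 and +1 and uses the standard coefficient formula b_alpha = (1/2^k) * sum of x^alpha over the fraction; the function names and the choice of the half fraction x1*x2*x3 = 1 are illustrative and not taken from the paper.

```python
from fractions import Fraction
from itertools import product


def indicator_coefficients(fraction, k):
    """Coefficients b_alpha of the indicator function of a fraction of the
    2^k full factorial design with levels -1 and +1 (assumed coding).

    alpha ranges over 0/1 exponent vectors; zero coefficients are omitted.
    """
    coeffs = {}
    for alpha in product((0, 1), repeat=k):
        total = 0
        for point in set(fraction):
            term = 1
            for x, a in zip(point, alpha):
                if a:
                    term *= x
            total += term
        if total != 0:
            coeffs[alpha] = Fraction(total, 2 ** k)
    return coeffs


def evaluate(coeffs, point):
    """Evaluate the polynomial sum_alpha b_alpha * x^alpha at a design point."""
    value = Fraction(0)
    for alpha, b in coeffs.items():
        term = b
        for x, a in zip(point, alpha):
            if a:
                term *= x
        value += term
    return value


# Half fraction of the 2^3 design defined by x1 * x2 * x3 = 1 (illustrative).
full = list(product((-1, 1), repeat=3))
half = [p for p in full if p[0] * p[1] * p[2] == 1]
coeffs = indicator_coefficients(half, 3)
```

For this fraction the only non-zero coefficients are those of the constant term and of x1*x2*x3, so the indicator function is 1/2 + (1/2) x1 x2 x3, which equals 1 on the fraction and 0 on the remaining points of the full factorial.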
A geometric characterisation of sensitivity analysis in monomial models
Sensitivity analysis in probabilistic discrete graphical models is usually
conducted by varying one probability value at a time and observing how this
affects output probabilities of interest. When one probability is varied then
others are proportionally covaried to respect the sum-to-one condition of
probability laws. The choice of proportional covariation is justified by a
variety of optimality conditions, under which the original and the varied
distributions are as close as possible under different measures of closeness.
For variations of more than one parameter at a time proportional covariation is
justified in some special cases only. In this work, for the large class of
discrete statistical models admitting a regular monomial parametrisation, we
demonstrate the optimality of newly defined proportional multi-way schemes with
respect to an optimality criterion based on the notion of I-divergence. We
demonstrate that there are choices of varying parameters for which proportional
covariation is not optimal and identify the sub-family of model distributions
where the distance between the original distribution and the one where
probabilities are covaried proportionally is minimum. This is shown by adopting
a new formal, geometric characterisation of sensitivity analysis in monomial
models, which include a wide array of probabilistic graphical models. We also
demonstrate the optimality of proportional covariation for multi-way analyses
in Naive Bayes classifiers.
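The one-at-a-time scheme described in this abstract can be sketched as follows. This is a minimal illustration of proportional covariation for a single varied probability, with the I-divergence taken to be the Kullback-Leibler divergence; the function names, the example distribution and the alternative "equal shift" covariation used for comparison are assumptions made for illustration, and the multi-way schemes of the paper are not reproduced.

```python
import math


def proportional_covariation(probs, index, new_value):
    """Set probs[index] to new_value and rescale the other entries by a
    common factor so that the distribution still sums to one."""
    old_value = probs[index]
    if not 0.0 <= new_value <= 1.0:
        raise ValueError("new_value must lie in [0, 1]")
    if old_value >= 1.0:
        raise ValueError("cannot rescale: the other probabilities sum to zero")
    scale = (1.0 - new_value) / (1.0 - old_value)
    return [new_value if i == index else p * scale for i, p in enumerate(probs)]


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) for strictly positive q."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


original = [0.2, 0.3, 0.5]

# Vary the first probability from 0.2 to 0.4 and covary the rest proportionally.
proportional = proportional_covariation(original, 0, 0.4)

# A hypothetical alternative: subtract the same amount from each other entry.
equal_shift = [0.4, 0.2, 0.4]

kl_proportional = kl_divergence(original, proportional)
kl_equal_shift = kl_divergence(original, equal_shift)
```

In this example the proportionally covaried distribution is closer to the original than the equal-shift alternative, which is the single-parameter optimality property the abstract refers to.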
Discovery of statistical equivalence classes using computer algebra
Discrete statistical models supported on labelled event trees can be
specified using so-called interpolating polynomials which are generalizations
of generating functions. These admit a nested representation. A new algorithm
exploits the primary decomposition of monomial ideals associated with an
interpolating polynomial to quickly compute all nested representations of that
polynomial. It thereby determines an important subclass of all trees
representing the same statistical model. To illustrate this method we analyze
the full polynomial equivalence class of a staged tree representing the best
fitting model inferred from a real-world dataset.
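To make the link between trees and nested representations concrete, the sketch below expands a labelled event tree into a polynomial by summing, over root-to-leaf paths, the product of the edge labels. This is an assumed simplified reading of the interpolating polynomial, and the tree encoding, function name and labels are hypothetical. The primary-decomposition algorithm of the paper is not reproduced; the example only shows that two different trees can be nested representations of the same polynomial.

```python
from collections import Counter


def expand(tree):
    """Expand a labelled event tree into a polynomial.

    A tree is a list of (edge_label, subtree) pairs and a leaf is the empty
    list. The result maps each monomial (a sorted tuple of edge labels) to
    its integer coefficient.
    """
    if not tree:
        return Counter({(): 1})
    poly = Counter()
    for label, subtree in tree:
        for monomial, coeff in expand(subtree).items():
            poly[tuple(sorted(monomial + (label,)))] += coeff
    return poly


# Tree unfolding X first, then Y: nested form a*(c + d) + b*(c + d).
x_then_y = [("a", [("c", []), ("d", [])]),
            ("b", [("c", []), ("d", [])])]

# Tree unfolding Y first, then X: nested form c*(a + b) + d*(a + b).
y_then_x = [("c", [("a", []), ("b", [])]),
            ("d", [("a", []), ("b", [])])]

poly_xy = expand(x_then_y)
poly_yx = expand(y_then_x)
same_polynomial = poly_xy == poly_yx
```

Both trees expand to ac + ad + bc + bd, so under this simplified reading they belong to the same polynomial equivalence class even though their graphs differ.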
Minimal average degree aberration and the state polytope for experimental designs
For a particular experimental design, there is interest in finding which
polynomial models can be identified in the usual regression setup. The
algebraic methods based on Groebner bases provide a systematic way of doing
this. The algebraic method does not in general produce all estimable models but
it can be shown that it yields models which have minimal average degree in a
well-defined sense and in both a weighted and unweighted version. This provides
an alternative measure to that based on "aberration" and moreover is applicable
to any experimental design. A simple algorithm is given and bounds are derived
for the criteria, which may be used to give asymptotic Nyquist-like
estimability rates as model and sample sizes increase.
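The notions of identifiability and average degree can be illustrated with a small sketch. It checks whether a monomial model is identifiable by a design through the rank of the design matrix, which is a direct linear-algebra check swapped in for the Groebner basis method of the paper, and it takes "average degree" to be the mean total degree of the model monomials, which is an assumed reading of the abstract. The design points and function names are hypothetical.

```python
from fractions import Fraction


def design_matrix(points, exponents):
    """Rows are design points, columns are monomials x^alpha evaluated there."""
    rows = []
    for point in points:
        row = []
        for alpha in exponents:
            value = Fraction(1)
            for x, a in zip(point, alpha):
                value *= Fraction(x) ** a
            row.append(value)
        rows.append(row)
    return rows


def rank(matrix):
    """Rank by exact Gaussian elimination over the rationals."""
    rows = [list(r) for r in matrix]
    found = 0
    n_cols = len(rows[0]) if rows else 0
    for col in range(n_cols):
        pivot = next((r for r in range(found, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[found], rows[pivot] = rows[pivot], rows[found]
        for r in range(found + 1, len(rows)):
            factor = rows[r][col] / rows[found][col]
            if factor != 0:
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[found])]
        found += 1
    return found


def is_identifiable(points, exponents):
    """A model is identifiable if its design matrix has full column rank."""
    return rank(design_matrix(points, exponents)) == len(exponents)


def average_degree(exponents):
    """Mean total degree of the model monomials (assumed definition)."""
    return Fraction(sum(sum(alpha) for alpha in exponents), len(exponents))


design = [(0, 0), (1, 0), (2, 1)]          # hypothetical three-point design
model_linear = [(0, 0), (1, 0), (0, 1)]    # 1, x, y
model_x_only = [(0, 0), (1, 0), (2, 0)]    # 1, x, x^2
model_y_only = [(0, 0), (0, 1), (0, 2)]    # 1, y, y^2
```

Here both {1, x, y} and {1, x, x^2} are identifiable, with average degrees 2/3 and 1 respectively, while {1, y, y^2} is not, because y takes only two distinct values on the design.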